On the Scalable Learning of Stochastic Blockmodel
Abstract
The stochastic blockmodel (SBM) lets us decompose and analyze an exploratory network without a priori knowledge of its intrinsic structure. However, learning an SBM from a large-scale network effectively and efficiently remains challenging because model selection and parameter estimation are computationally expensive. To address this issue, we present a novel SBM learning algorithm, BLOS (BLOckwise Sbm learning). Unlike existing methods, BLOS executes model selection and parameter estimation concurrently rather than alternately, by embedding the minimum message length criterion into a block-wise EM algorithm. This greatly reduces the time complexity of SBM learning without sacrificing learning accuracy or modeling flexibility. Its effectiveness and efficiency have been tested through rigorous comparisons with state-of-the-art methods on both synthetic and real-world networks.
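For readers unfamiliar with the model, the following is a minimal sketch of the generative process of a basic SBM and of its complete-data log-likelihood, the quantity an EM-style learner works with. It is not the BLOS algorithm. The undirected, unweighted setting and the names `sample_sbm`, `log_likelihood`, `block_probs`, and `link_probs` are assumptions made for illustration only.

```python
import math
import random


def sample_sbm(n, block_probs, link_probs, seed=0):
    """Sample an undirected graph from a basic SBM (illustrative assumption).

    n           -- number of nodes
    block_probs -- prior probability of each of the K blocks
    link_probs  -- K x K matrix; link_probs[a][b] is the edge probability
                   between a node in block a and a node in block b
    """
    rng = random.Random(seed)
    k = len(block_probs)
    # Draw a latent block label for every node.
    z = rng.choices(range(k), weights=block_probs, k=n)
    adj = [[0] * n for _ in range(n)]
    # Draw each edge independently given the two endpoint labels.
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < link_probs[z[i]][z[j]]:
                adj[i][j] = adj[j][i] = 1
    return z, adj


def log_likelihood(z, adj, block_probs, link_probs):
    """Complete-data log-likelihood of labels z and adjacency adj.

    Assumes every probability lies strictly between 0 and 1.
    """
    n = len(z)
    ll = sum(math.log(block_probs[label]) for label in z)
    for i in range(n):
        for j in range(i + 1, n):
            p = link_probs[z[i]][z[j]]
            ll += math.log(p) if adj[i][j] else math.log(1.0 - p)
    return ll


# Hypothetical two-block example: dense within blocks, sparse between them.
priors = [0.5, 0.5]
links = [[0.8, 0.05], [0.05, 0.8]]
labels, graph = sample_sbm(30, priors, links, seed=1)
score = log_likelihood(labels, graph, priors, links)
```

In this sketch the number of blocks K is fixed in advance. Choosing K is the model selection step that the abstract says BLOS handles concurrently with parameter estimation.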
Related resources
Scalable Inference of Overlapping Communities
We develop a scalable algorithm for posterior inference of overlapping communities in large networks. Our algorithm is based on stochastic variational inference in the mixed-membership stochastic blockmodel (MMSB). It naturally interleaves subsampling the network with estimating its community structure. We apply our algorithm on ten large, real-world networks with up to 60,000 nodes. It converg...
Perfect clustering for stochastic blockmodel graphs via adjacency spectral embedding
Vertex clustering in a stochastic blockmodel graph has wide applicability and has been the subject of extensive research. In this paper, we provide a short proof that the adjacency spectral embedding can be used to obtain perfect clustering for the stochastic blockmodel and the degree-corrected stochastic blockmodel. We also show an analogous result for the more general random dot product graph ...
A Frequency-based Stochastic Blockmodel
We propose a frequency-based infinite relational model (FIRM), which takes into account the frequency of relation whereas stochastic blockmodels ignore frequency. We also derive a variational inference method for the FIRM to apply to a large dataset. Experimental results show that the FIRM gives better clustering results than a stochastic blockmodel on a dataset which has the frequency of relat...
Modeling Overlapping Communities with Node Popularities
We develop a probabilistic approach for accurate network modeling using node popularities within the framework of the mixed-membership stochastic blockmodel (MMSB). Our model integrates two basic properties of nodes in social networks: homophily and preferential connection to popular nodes. We develop a scalable algorithm for posterior inference, based on a novel nonconjugate variant of stochas...
Asymptotic Normality of Maximum Likelihood and its Variational Approximation for Stochastic Blockmodels
Variational methods for parameter estimation are an active research area, potentially offering computationally tractable heuristics with theoretical performance bounds. We build on recent work that applies such methods to network data, and establish asymptotic normality rates for parameter estimates of stochastic blockmodel data, by either maximum likelihood or variational estimation. The resul...